Main

Alzheimer’s disease (AD) is the most common age-related neurodegenerative disease, characterized by decades-long buildup of amyloid-beta (Aβ) plaques and neurofibrillary tau tangles followed by dementia1. Rates of cognitive decline in AD are extremely heterogeneous, with symptom onset occurring between the ages of 40 and 100 years and conversion from mild cognitive impairment (MCI) to AD dementia occurring in 2–20 years2. While the development of cerebrospinal fluid (CSF) and positron emission tomography (PET) biomarkers of Aβ and tau has begun to untangle this heterogeneity and has thereby improved AD diagnosis, patient stratification and drug development3,4,5,6,7, Aβ and tau still only explain 20–40% of the variance in cognitive impairment (CI) in AD8,9,10,11 (Extended Data Fig. 1a), suggesting the existence of additional drivers of AD dementia. The prevalence of Aβ+ cognitively normal aged individuals further underscores the need for an increased understanding of what drives AD dementia versus cognitive resilience12,13.

The ‘A/T/N’ (Aβ/tau/neurodegeneration) AD biomarker framework14, developed by the National Institute on Aging (NIA) and the Alzheimer’s Association, structures the integration of biomarkers. Among CSF biomarkers, Aβ42 is typically used to define ‘A’ positivity and pTau181 to define ‘T1’ (phosphorylated secreted tau) positivity14. The CSF pTau181:Aβ42 ratio captures both aspects simultaneously15,16. ‘T2’ includes emerging biomarkers of fibrillary tau proteinopathy, such as CSF pT205, CSF MTBR-243 (ref. 17) and tau PET18. The ‘N’ category includes Aβ-independent and tau-independent biomarkers of AD, such as neurofilament light (NfL) for axon degeneration and neurogranin (Ng) for synapse dysfunction5. However, these ‘N’ biomarkers explain only a small additional proportion of variance in CI beyond Aβ and tau5.

To discover new robust Aβ-independent and tau-independent correlates of CI in AD, we performed large-scale proteomics (SomaScan, mass spectrometry (MS)) on the CSF of 3,397 individuals across six deeply phenotyped case–control cohorts with AD spanning both sporadic and autosomal dominant AD (ADAD): Stanford (includes the Stanford Alzheimer’s Disease Research Center (ADRC), Stanford Aging and Memory Study (SAMS) and Poston cohort); Knight-ADRC; Alzheimer’s Disease Neuroimaging Initiative (ADNI); Dominantly Inherited Alzheimer’s Network (DIAN); BioFINDER2; and Kuopio University Hospital (Fig. 1a and Supplementary Table 1). We integrated these CSF proteomics data with CSF and PET biomarkers of Aβ and tau, cognitive function, age, sex, APOE4 genotype and ADAD mutation status to derive a robust CSF biomarker of CI that explains CI beyond existing A/T/N biomarkers. Lastly, we derived a plasma surrogate of the CSF biomarker based on the SomaScan plasma proteomics data from 2,829 individuals from the Knight-ADRC and Religious Order Study/Memory Aging Project (ROSMAP) cohorts and evaluated the signature in an additional 9,502 individuals from Stanford, the Global Neurodegeneration Proteomics Consortium (GNPC) (https://www.neuroproteome.org/) and the Atherosclerosis Risk in Communities (ARIC) study.

Fig. 1: The CSF YWHAG:NPTX2 ratio explains a substantial proportion of variance in CI beyond amyloid and tau in AD.

a, Study design. Integration of CSF proteomics, AD pathology biomarkers and clinical cognitive scoring from six independent cohorts to identify the molecular correlates of CI, independently of AD pathology. b, Volcano plot showing the change with CI independent of age, sex, cohort, APOE4 dose and pTau181:Aβ42, and PC1 of the CSF proteome, in the Knight-ADRC and ADNI cohorts (n = 1,472). Bold indicates synapse proteins based on the SynGO database; q values are Benjamini–Hochberg-corrected P values. c, Rank-based pathway enrichment heatmap of differentially abundant proteins. Cells are color-coded according to −log10(q). d, A penalized linear model was trained to predict CI severity using synaptic proteins that significantly changed with CI. RFE showed that two proteins sufficiently captured 83% of the full model performance. Model coefficients show the normalized ratio between YWHAG.1 and NPTX2. e, Box plot showing YWHAG.1:NPTX2 versus CI severity across cohorts with the SomaScan data (n = 2,067). The box bounds are the Q1, median and Q3; the whiskers show Q1 − 1.5× the interquartile range (IQR) and Q3 + 1.5× the IQR. f, CSF YWHAG.1:NPTX2 regressed against age, CI, sex and cohort in a linear model (n = 2,067). The points and error bars represent the standardized effect sizes and 95% confidence intervals. g, AUC results from the logistic regression to classify CI stage based on YWHAG.1:NPTX2 or pTau181:Aβ42. All SomaScan cohorts are included (n = 2,067). h, YWHAG.1:NPTX2 versus pTau181:Aβ42, color-coded according to CI. Linear correlation and P value are shown. i, r2 results from a linear model regressing CI against covariates displayed on the x axis in A+T1+ individuals (n = 898). The difference in r2 values between the full model and the model with only pTau181:Aβ42 is shown. j, As in h but for YWHAZ:NPTX2 versus tau PET in Aβ+ individuals in BioFINDER2. k, As in i but with different covariates in BioFINDER2 (n = 512).
The difference in r2 values between the full model and the model with only Aβ42:Aβ40 + tau PET is shown. The bars and error bars in g, i and k represent bootstrapped (n = 1,000) means and 95% confidence intervals. Two-sided P values were calculated using the empirical distribution of the bootstrapped test statistic. **P < 0.01, ***P < 0.001. Graphics in a created with BioRender.com.

Multicohort CSF proteomics for AD biomarker discovery

We performed proteomics on 3,397 CSF samples (3,187 with a complete CI diagnosis) from six independent cohorts. To identify CSF proteins that explained additional variance of CI beyond AD pathology, we regressed the global clinical dementia rating (CDR) (a clinical CI score) against CSF protein levels, while adjusting for CSF pTau181:Aβ42, age, sex, APOE4, cohort and principal component 1 (PC1) of the proteome (Methods). We analyzed the SomaScan proteomics data (7,289 protein measurements per sample) from the Knight-ADRC (n = 756) and ADNI (n = 716) cohorts for discovery.

We identified 675 significantly (Benjamini–Hochberg q < 0.05) upregulated and 721 significantly downregulated proteins with CI (Fig. 1b and Supplementary Tables 2 and 3). The most significant proteins were enriched at the synapse (based on the SynGO database19; Fig. 1c). The most upregulated synapse proteins included YWHAG, YWHAZ, YWHAH, NEFL, NEFH, DLG2, HOMER1, MAP1LC3A, PPP3CA and PPP3R1. The YWHA family (which encodes 14-3-3 proteins), DLG2 and calcineurin subunits (PPP3CA and PPP3R1) were strongly associated with pTau181:Aβ42 (ref. 20) (Extended Data Fig. 1b,c). In line with these observations, Aβ42 signaling promoted calcineurin activity21, and inhibition of calcineurin activity protected from Aβ-induced and tau-induced synapse loss and CI in mice22,23. Notably, SMOC1, an extracellular matrix protein previously linked to AD and Aβ plaques24,25, was not associated with CI after pTau181:Aβ42 adjustment (Extended Data Fig. 1b).
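The Benjamini–Hochberg q values used as the significance threshold above can be illustrated with a minimal sketch. The function name and the example P values are hypothetical and are not taken from the study; the actual analysis pipeline is described in the Methods.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q values in the same order as p_values."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each q value is the minimum
    # of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-protein P values, for illustration only.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_q = benjamini_hochberg(example_p)
# A protein is called significant when q < 0.05, as in the text.
significant = [q < 0.05 for q in example_q]
```

In this toy input, the third and fourth proteins have nominal P < 0.05 but do not pass the q < 0.05 threshold, which is the behavior the correction is meant to provide.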

The most downregulated proteins included NPTX2, NPTXR, SLITRK1, CBLN4, LRFN2 and EPHA4. These proteins were weakly negatively associated with pTau181:Aβ42 (ref. 20) (Extended Data Fig. 1b,c). The most downregulated protein was NPTX2, encoded by an immediate early gene that regulates homeostatic scaling of excitatory synapses on parvalbumin interneurons26 to prevent neuronal network hyperactivity27. In line with its reduction in AD CSF, NPTX2 mRNA and protein are downregulated in AD neurons based on brain single-cell RNA sequencing, immunohistochemistry and bulk proteomic studies28,29. Interestingly, overexpression of NPTX2 in the tau P301S mouse hippocampus protects synapses from complement-mediated glial engulfment30, suggesting that it may be a synaptic resilience factor.

Given the enrichment of synapse proteins, we sought to derive a multiprotein synaptic signature of CI. Using the ADNI data, we trained a penalized linear model to predict CI severity based on the levels of 214 synapse proteins that significantly changed with CI. We used recursive feature elimination (RFE) to further simplify the model to facilitate clinical applications (Fig. 1d). The model identified a near 1:1 difference between YWHAG and NPTX2 to be a suitable signature of CI (Fig. 1d). As we log-transformed and z-scored the protein levels before the analyses, the difference between normalized protein levels represents a normalized ratio. The figures refer to YWHAG.1, a specific YWHAG proteoform detected by the SomaLogic aptamer with SeqId: 4179-57. Notably, ratios between CSF YWHA family proteins and NPTX2 based on MS have been previously associated with general AD phenotypes31,32, suggesting reproducibility across cohorts and proteomic platforms.
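The reason a difference between normalized protein levels can be read as a ratio is the identity log(a) − log(b) = log(a/b). The sketch below illustrates this with hypothetical abundances; the variable names, the values and the use of the sample standard deviation for z-scoring are assumptions made for illustration, not the study's normalization code.

```python
import math
import statistics

# Hypothetical raw abundances for two proteins across four samples.
protein_a = [120.0, 150.0, 300.0, 480.0]
protein_b = [400.0, 380.0, 250.0, 160.0]

log_a = [math.log10(x) for x in protein_a]
log_b = [math.log10(x) for x in protein_b]

# On the log scale, a difference equals the log of the ratio.
log_diff = [a - b for a, b in zip(log_a, log_b)]
log_of_ratio = [math.log10(a / b) for a, b in zip(protein_a, protein_b)]


def z_score(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Difference of z-scored log levels: a ratio-like score in which each
# protein is first rescaled by its own spread (a 'normalized ratio').
normalized_ratio = [a - b for a, b in zip(z_score(log_a), z_score(log_b))]
```

Because each protein is scaled separately, the normalized ratio is not identical to the raw log ratio, but it preserves the same interpretation: higher values mean more of the first protein relative to the second.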

We validated the association of YWHAG:NPTX2 with CI across all cohorts with SomaScan data (ADNI r = 0.54; Knight-ADRC r = 0.55; Stanford r = 0.62; DIAN r = 0.66; total n = 2,067) including both sporadic AD and ADAD (Fig. 1e and Supplementary Table 4). Correlations were consistent across sexes (Extended Data Fig. 1d) and slightly exceeded the correlations of pTau181:Aβ42 with CI (Extended Data Fig. 1e). The YWHAG:NPTX2 ratio was not significantly affected by the cohort (Fig. 1f), making it advantageous for clinical applications.

To assess the potential of using YWHAG:NPTX2 for cognitive diagnosis, we tested logistic regression models that aimed to distinguish individuals across different CI stages. We found that while pTau181:Aβ42 had better performance in distinguishing individuals with MCI versus cognitively normal individuals, YWHAG:NPTX2 had better performance in the later stages (Fig. 1g, Extended Data Fig. 1f and Supplementary Table 5). When tasked to distinguish individuals with dementia versus cognitively normal individuals, YWHAG:NPTX2 was predictive, with an area under the curve (AUC) of 0.97 (Fig. 1g and Extended Data Fig. 1g).
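As a reminder of what the AUC measures, it equals the probability that a randomly chosen case has a higher score than a randomly chosen control, counting ties as one half. For a single-predictor logistic model with a positive coefficient, the fitted probabilities are a monotonic function of the predictor, so the same AUC can be computed from the biomarker directly. The sketch below uses hypothetical scores and a hypothetical function name.

```python
def auc(case_scores, control_scores):
    """Probability that a case outranks a control (ties count as 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical biomarker values, for illustration only.
dementia = [1.8, 2.1, 0.9, 2.5]
normal = [0.2, 1.0, -0.4, 0.5]
example_auc = auc(dementia, normal)  # 15 of 16 pairs are correctly ordered
```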

We next sought to determine the robustness of YWHAG:NPTX2 in explaining CI beyond AD pathology. We observed that while YWHAG:NPTX2 and pTau181:Aβ42 were correlated (r = 0.61), low and high levels of YWHAG:NPTX2 further separated A+T1-positive (A+T1+) (log10 pTau181:Aβ42 > −1; Methods) individuals into no impairment versus dementia, respectively (Fig. 1h). Among A+T1+ individuals (n = 898), 62% of individuals with low levels of YWHAG:NPTX2 (bottom 25th percentile) were cognitively normal and 37% had only MCI, whereas only 4% of individuals with high YWHAG:NPTX2 (top 25th percentile) were cognitively normal and 46% had dementia (Extended Data Fig. 2a). This pattern was consistent across cohorts and both sporadic AD and ADAD. Using linear regression, we found that pTau181:Aβ42 explained 10% of the variance in CI in A+T1+ individuals, YWHAG:NPTX2 explained 36% and YWHAG:NPTX2 explained an additional 27% beyond pTau181:Aβ42 (Fig. 1i). YWHAG:NPTX2 was significantly associated with CI independently of pTau181:Aβ42 and age in A+T1+ individuals across all cohorts and proteomic platforms (Extended Data Fig. 2b). Notably, in the DIAN ADAD cohort, YWHAG:NPTX2 was associated with CI even after accounting for the estimated age at symptom onset (Extended Data Fig. 2b).
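The "additional variance explained" reported here is the difference in r2 between a model containing both biomarkers and a model containing the baseline biomarker alone. For two predictors, the full-model r2 has a closed form in terms of pairwise Pearson correlations, which the sketch below uses. All data and function names are hypothetical, and the covariate handling and bootstrapping of the actual analysis are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def additional_variance_explained(outcome, baseline, added):
    """Return (r2 of baseline alone, r2 of both predictors, their difference)."""
    r_y1 = pearson(outcome, baseline)
    r_y2 = pearson(outcome, added)
    r_12 = pearson(baseline, added)
    r2_baseline = r_y1 ** 2
    # Closed-form r2 of a linear model with two predictors
    # (undefined if the two predictors are perfectly correlated).
    r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_baseline, r2_full, r2_full - r2_baseline


# Hypothetical values: a CI score, a pathology ratio and a synaptic ratio.
ci_score = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0, 2.0, 2.0]
pathology_ratio = [-0.9, -0.6, -0.8, -0.5, -0.7, -0.3, -0.6, -0.2]
synaptic_ratio = [-1.2, -0.5, -0.3, 0.2, 0.4, 0.9, 1.1, 1.8]
r2_base, r2_full, r2_added = additional_variance_explained(
    ci_score, pathology_ratio, synaptic_ratio
)
```

The added r2 is never negative, because adding a predictor cannot reduce the variance explained by an ordinary least squares fit.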

While pTau181:Aβ42 is a robust biomarker of Aβ plaques and phosphorylated secreted tau, it is not well representative of tau tangle load (T2), which is known to correlate with CI more strongly18. Therefore, we analyzed the BioFINDER2 cohort; we performed time-point-matched targeted CSF synapse protein MS proteomics, tau PET imaging and CSF Aβ42:Aβ40 measurement32. As YWHAG was not measured, we analyzed YWHAZ, a related protein (r = 0.92; Extended Data Fig. 2c), which was also associated with CI albeit not as strongly (Fig. 1b and Extended Data Fig. 2d). We confirmed the association of YWHAZ:NPTX2 with CI based on MS in BioFINDER2 (r = 0.63; Extended Data Fig. 2e). We plotted YWHAZ:NPTX2 versus tau PET in Aβ+ individuals (n = 512), color-coded according to CI severity (Fig. 1j). We observed several interesting patterns. First, we found a moderate correlation between YWHAZ:NPTX2 and tau PET (r = 0.45). Second, we observed that all individuals with above moderate levels of tau had above moderate levels of YWHAZ:NPTX2, but not vice versa, suggesting that YWHAZ:NPTX2 may change before tau during AD progression. Third, we observed that YWHAZ:NPTX2 and tau PET independently explained CI severity. Using linear regression, we found that Aβ42:Aβ40 and tau PET together explained 35% of the variance in CI in Aβ+ individuals; YWHAZ:NPTX2 explained an additional 11% beyond Aβ42:Aβ40 and tau PET (Fig. 1k). The association of YWHAZ:NPTX2 with CI was robust to additional adjustment with age, APOE4 dose, sex and CSF NfL (Extended Data Fig. 2f,g). Although YWHAG:NPTX2 and YWHAZ:NPTX2 were highly correlated (r = 0.94) in SomaScan (Extended Data Fig. 2c), additional studies are needed to confirm the association of YWHAG:NPTX2 with CI independently of tau PET.

Together, these results show that CSF synapse proteins, some with established causal roles in synaptic and cognitive resilience to AD pathology in mouse models (that is, calcineurin, NPTX2), are among the strongest correlates of CI severity independent of Aβ and tau in humans, and that the CSF YWHAG:NPTX2 ratio is a synapse protein signature that explains a major proportion of variance in CI in AD beyond gold standard biomarkers of Aβ and tau.

CSF YWHAG:NPTX2 versus established neurodegeneration AD biomarkers

While we observed that CSF YWHAG:NPTX2 explained CI beyond Aβ and tau, whether it also explained CI beyond established biomarkers of neurodegeneration and synapse dysfunction was unknown. Therefore, we compared CSF YWHAG:NPTX2 with CSF NfL, growth-associated protein 43 (GAP-43) and Ng5, which were also measured on the SomaScan assay. We confirmed that the levels of these biomarkers based on SomaScan were highly correlated with levels based on established immunoassays used in AD research33,34 (Supplementary Fig. 2). We further confirmed that the levels of these biomarkers were higher in AD and A+T1+ individuals compared to controls in the Knight-ADRC and ADNI cohorts (Fig. 2a).

Fig. 2: CSF YWHAG:NPTX2 versus established neurodegeneration AD biomarkers.

a, Levels of CSF NfL, GAP-43, Ng and YWHAG.1:NPTX2 in AD versus healthy individuals and in A+T1+ versus A–T1− individuals in the Knight-ADRC and ADNI cohorts are shown. ***P < 0.001 based on a standard two-sided t-test. The box bounds show the Q1, median and Q3; the whiskers show the Q1 − 1.5× the IQR and Q3 + 1.5× the IQR. b, Linear correlation and P value between CSF YWHAG.1:NPTX2 and CSF NfL in the Knight-ADRC and ADNI cohorts. The colors indicate the CI stage as shown in Fig. 1e. c, As in b but for CSF GAP-43. d, As in b but for CSF Ng. e, r2 results from linear models regressing CI against covariates displayed on the x axis in the Knight-ADRC and ADNI cohorts (n = 1,472). The bars and error bars represent bootstrapped (n = 1,000) means and 95% confidence intervals. Two-sided P values were calculated using the empirical distribution of the bootstrapped test statistic. The difference in r2 values between the full model and the model with NfL, GAP-43 and Ng is shown. ***P < 0.001.

We examined pairwise correlations between YWHAG:NPTX2 and these ‘N’ biomarkers in the Knight-ADRC and ADNI cohorts (total n = 1,472) and found that YWHAG:NPTX2 was notably distinct from the rest (Fig. 2b–d). To our surprise, CSF YWHAG:NPTX2 was slightly negatively correlated with GAP-43 (r = −0.07) and Ng (r = −0.09), even though all biomarkers were positively correlated with AD.

Regarding associations with CI, we observed that NfL, GAP-43 and Ng, individually and together, explained only a small proportion of the variance in CI (1–5%) in A+T1+ individuals (Fig. 2e). Conversely, YWHAG:NPTX2 explained 31% of the variance, 28% beyond NfL, GAP-43 and Ng.

These results suggest that the levels of YWHAG:NPTX2 represent a ‘pathology’ that is distinct from previously reported AD neurodegeneration and synapse dysfunction biomarkers and one that is much more closely related to CI.

CSF YWHAG:NPTX2 in normal aging and ADAD

As age is the strongest risk factor for AD onset, we next asked whether YWHAG:NPTX2 increases during normal aging before CI. Surprisingly, we found that YWHAG:NPTX2 increased with age not only in later decades, but also in the earliest decades of adulthood, ~30 years before changes in pTau181:Aβ42 (Fig. 3a). This pattern was replicated in the BioFINDER2 cohort (Extended Data Fig. 3a).

Fig. 3: CSF YWHAG:NPTX2 ratio increases with normal aging and presymptomatic ADAD.

a, Changes with age of YWHAG.1:NPTX2 and pTau181:Aβ42 in cognitively normal non-ADAD mutation carriers. Locally weighted scatterplot smoothing regression lines with 95% confidence intervals are shown. b, Changes with age of YWHAG.1:NPTX2 in cognitively normal individuals aged under 55 years stratified according to ADAD mutation carrier status. P values from a linear model regressing YWHAG.1:NPTX2 against ADAD carrier status, age and their interaction are shown. Linear regression lines with 95% confidence intervals for carriers versus noncarriers are shown. c, Association between mean EAO and slope of YWHAG.1:NPTX2 change with age. Spearman correlation and P value are shown. Data from noncarriers are shown for comparison. The linear regression line with the 95% confidence intervals is shown. d, Changes with age of YWHAG.1:NPTX2 in cognitively normal non-ADAD mutation carriers stratified according to APOE genotype. P values from a linear model regressing YWHAG.1:NPTX2 against APOE4 dose, age and their interaction are shown. Linear regression lines with 95% confidence intervals for specified APOE genotypes are shown. e, Box plot showing changes in YWHAG.1:NPTX2 across different age groups and CI stages (n = 1,846). The box bounds show the Q1, median and Q3; the whiskers show Q1 − 1.5× the IQR and Q3 + 1.5× the IQR; s.d. changes in YWHAG.1:NPTX2 with cognitively normal aging and cognitive decline are shown. f, Changes with estimated years to symptom onset (EYO) of YWHAG.1:NPTX2 stratified according to ADAD carrier status. ADAD carrier points are color-coded according to CI stage as in Fig. 1e. Linear regression lines with 95% confidence intervals for carriers versus noncarriers before and after estimated symptom onset are shown. Slopes for carriers are shown. g, Changes with age and CI stage of YWHAG.1:NPTX2 for all individuals. Points are color-coded according to CI stage as in Fig. 1e and sized according to Aβ positivity. 
h, Schematic of proposed model showing that changes in YWHAG.1:NPTX2 with cognitively normal aging underlie age at AD onset.

To determine whether changes with age in YWHAG:NPTX2 precede AD symptom onset, we leveraged data from ADAD mutation carriers in the DIAN cohort who have genetically determined early-onset AD. Specifically, we tested whether YWHAG:NPTX2 had a steeper age-related increase in presymptomatic carriers versus noncarriers. We tested a linear model regressing YWHAG:NPTX2 against carrier status, age and their interaction, among cognitively normal individuals aged under 55 years, the age range where noncarriers are A–T1− (Fig. 3a). We found that carriers had significantly higher YWHAG:NPTX2 (ref. 35) (P = 7.21 × 10−13) and a steeper age-related increase (twofold increase in slope, interaction P = 0.032) (Fig. 3b).

ADAD mutations have varying degrees of severity, with estimated ages at symptom onset ranging from 25 to 65 years depending on the mutation25,36. To determine whether age-related slopes of YWHAG:NPTX2 correlate with ages at symptom onset, we grouped presymptomatic carriers into bins based on estimated age at onset (EAO) (<35, 35–45, 45–55, 55–65) and calculated age-related YWHAG:NPTX2 slopes per bin. We observed a strong negative correlation between mean EAO per bin and age-related YWHAG:NPTX2 slopes (Spearman r = −0.9, P = 0.037), whereby those with earlier ages at symptom onset had steeper age-related increases in YWHAG:NPTX2 (Fig. 3c and Extended Data Fig. 3b).

We then examined the effects of the APOE genotype, the leading genetic risk factor for sporadic AD, on age-related YWHAG:NPTX2 slopes. We regressed YWHAG:NPTX2 against APOE4 (high-risk allele) dose, age and their interaction in cognitively normal individuals across the lifespan from the Knight-ADRC, ADNI and Stanford SomaScan cohorts. Like the ADAD mutation carrier status, APOE4 was significantly associated with higher YWHAG:NPTX2 (P = 7.50 × 10−6) and a steeper increase in YWHAG:NPTX2 with age (33% increase in slope compared to APOE3/3 homozygotes, P = 3.40 × 10−3; Fig. 3d). APOE2 (protective allele) carriers showed no significant differences, although we suspect this may be because of the limited sample size.

Our analyses thus far revealed YWHAG:NPTX2 increases with normal aging and presymptomatic AD as well as CI severity during AD progression. We next compared the degrees to which YWHAG:NPTX2 increases during these two phases. We found that YWHAG:NPTX2 increased by 1.95 s.d. over 60 years of normal aging and then 2.66 s.d. from cognitively normal aged to late-stage dementia (Fig. 3e). Although cognitively normal versus dementia groups were age-matched given the nature of case–control studies, assuming ~20 years for progression from Aβ+ cognitively normal to AD dementia based on population-based studies37,38, our data suggest that AD progression mimics 82 years of ‘normal’ age-related increases in YWHAG:NPTX2, representing a stark 4.1-time increase in slope during AD progression compared to normal aging.
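The 82-year and 4.1-time figures follow directly from the reported s.d. changes and the assumed ~20-year progression window. A short worked calculation, using only numbers stated in the text (the variable names are illustrative):

```python
# Values reported in the text.
aging_increase_sd = 1.95   # s.d. increase over normal aging
aging_years = 60           # years of normal aging
ad_increase_sd = 2.66      # s.d. increase from cognitively normal to late-stage dementia
ad_years = 20              # assumed duration of progression (~20 years)

# Average slopes in s.d. per year for each phase.
aging_slope = aging_increase_sd / aging_years
ad_slope = ad_increase_sd / ad_years

# Years of normal aging needed to produce the same increase seen during AD.
equivalent_aging_years = ad_increase_sd / aging_slope

# Fold change in slope during AD progression relative to normal aging.
slope_fold_change = ad_slope / aging_slope
```

Rounded, this gives about 82 equivalent years of normal aging and a 4.1-time steeper slope. Both estimates scale with the assumed 20-year progression window.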

We examined this phenomenon in ADAD by plotting YWHAG:NPTX2 versus EYO. We compared YWHAG:NPTX2 slopes before and after estimated symptom onset and, similar to our estimates in sporadic AD, we observed a 3.4-time increase in the YWHAG:NPTX2 slope during ADAD symptom progression compared to the presymptomatic phase (Fig. 3f). Notably, YWHAG:NPTX2 increased in ADAD ~22 years before estimated symptom onset (Fig. 3f).

To obtain a bird’s-eye view of all these data, we plotted YWHAG:NPTX2 versus age color-coded according to CI stage and sized according to AT1 status for ADAD carriers versus noncarriers (Fig. 3g). We confirmed the extremely accelerated increase in YWHAG:NPTX2 among ADAD mutation carriers leading up to early-onset AD, as well as the widespread heterogeneity in noncarriers leading to sporadic AD in some and cognitive maintenance in others, despite amyloid positivity and old age (Fig. 3g,h).

Collectively, these results demonstrate that YWHAG:NPTX2, a robust correlate of CI severity in AD, substantially increases with cognitively normal aging and presymptomatic AD.

CSF YWHAG:NPTX2 associations with future AD progression

We next sought to determine the potential clinical utility of YWHAG:NPTX2 in predicting future AD onset and progression. First, we leveraged Aβ and tau PET imaging data collected 4–15 years after CSF draw in the ADNI cohort to assess whether YWHAG:NPTX2 predicts Aβ-driven tau accumulation (Fig. 4a). Using linear regression, we found that YWHAG:NPTX2 modified the future association between Aβ and tau PET (YWHAG:NPTX2 × Aβ PET interaction, P = 6.84 × 10−4), adjusting for baseline CI, pTau181:Aβ42, age, sex and APOE4 (n = 120). Among individuals with high future Aβ load, high baseline YWHAG:NPTX2 was associated with higher future tau PET, while low YWHAG:NPTX2 was associated with limited Aβ-related tau PET increase (Fig. 4b). These results align with previous studies showing that Aβ combined with synapse dysfunction and neuronal hyperactivity drives tau accumulation and propagation39,40, and a study showing that levels of CSF GAP-43 modify the rate of Aβ-driven tau accumulation41.

Fig. 4: CSF YWHAG:NPTX2 ratio predicts future tau accumulation and cognitive decline beyond Aβ, tau, NfL, GAP-43 and Ng.

a, ADNI cohort analyses for b–d, correlating baseline CSF YWHAG.1:NPTX2 and pTau181:Aβ42 with future amyloid and tau PET imaging and cognitive scoring data. b, Future tau PET (global standardized uptake value ratio (SUVR)) versus future amyloid PET (global centiloid), color-coded according to the percentiles of YWHAG.1:NPTX2. Linear regression lines with 95% confidence intervals for specified YWHAG.1:NPTX2 percentiles are shown. c, Future ADAS13 cognitive score versus future tau PET (global SUVR) color-coded according to the percentiles of YWHAG.1:NPTX2 in Aβ+ individuals. d, r2 results from linear models regressing CI against covariates displayed on the x axis (n = 70). The bars and error bars represent bootstrapped (n = 1,000) means and 95% confidence intervals. Two-sided P values were calculated via the empirical distribution of the bootstrapped test statistic. The difference in r2 values between the two models is shown. ***P < 0.001. e, ADNI, Knight-ADRC and Stanford analyses for f–k, associating baseline CSF YWHAG.1:NPTX2 with future cognitive decline. f, Cox proportional hazards regression was used to associate YWHAG.1:NPTX2 with future cognitive decline in A+T1+ individuals with MCI to mild dementia, while adjusting for pTau181:Aβ42, CSF NfL, CSF Ng, APOE4, age, sex and CI stage. Results from a cross-cohort, fixed effects meta-analysis are shown (total n = 520). The points and error bars represent HRs and 95% confidence intervals. g, As in f but for predicting dementia onset in A+T1+ cognitively normal individuals (total n = 171). CI stage was not included as a covariate because all individuals were cognitively normal. h, Cox proportional hazards regression was used to associate YWHAG.1:NPTX2 with future cognitive decline in A+T1+ individuals, while adjusting for pTau181:Aβ42, CSF NfL, CSF Ng, APOE4, age, sex and CI stage in the combined sample (n = 697). The points and error bars represent the HRs and 95% confidence intervals for each covariate.
i, Kaplan–Meier curves with 95% confidence intervals showing the rates of future cognitive decline in A+T1+ versus A–T1− individuals. The HR and 95% confidence interval are shown. j, As in i but for YWHAG.1:NPTX2high (top 25th percentile) versus YWHAG.1:NPTX2low (bottom 25th percentile) individuals. k, As in i but for A+T1+ YWHAG.1:NPTX2high versus A–T1− YWHAG.1:NPTX2low individuals. Graphics in a and e created with BioRender.com.

More important than predicting future tau tangle buildup is predicting future cognitive decline. We plotted future Alzheimer’s Disease Assessment Scale-13 (ADAS13) cognitive score versus future tau load, color-coded according to baseline YWHAG:NPTX2 in future Aβ PET+ individuals (n = 70) (Fig. 4c). ADAS13 was chosen for its greater sensitivity and dynamic range compared with global CDR. Among individuals with low-to-mild tau buildup, we observed that YWHAG:NPTX2 distinguished cognitively normal versus impaired individuals (Fig. 4c). All individuals with high tau PET had high YWHAG:NPTX2. Using linear regression, we found that pTau181:Aβ42, Aβ PET and tau PET together explained 41% of the variance in ADAS13 in A+T1+ individuals, and YWHAG:NPTX2 explained an additional 13% (Fig. 4d). We confirmed that YWHAG:NPTX2 was significantly associated with future ADAS13, while adjusting for tau tangle load and several additional covariates (Extended Data Fig. 4a).

To more broadly assess whether YWHAG:NPTX2 could predict future cognitive decline independent of Aβ and tau, we used data from all cohorts with longitudinal cognitive follow-up (ADNI, Knight-ADRC, Stanford; Fig. 4e). Global CDR CI staging was used because it was measured across all cohorts. We analyzed both dementia progression from an MCI to mild dementia baseline, as well as dementia onset from a cognitively normal baseline.

Using Cox proportional hazards regression, we tested the association of YWHAG:NPTX2 with a future CI stage increase among A+T1+ individuals with MCI to mild dementia over 1–15 years, adjusting for baseline CI, pTau181:Aβ42, CSF NfL, CSF Ng (CSF GAP-43 was redundant with Ng; Supplementary Fig. 2c), age, sex and APOE4 dose in each cohort (total n = 520). YWHAG:NPTX2 significantly predicted future cognitive decline across all cohorts; in a meta-analysis, an s.d. increase in YWHAG:NPTX2 conferred a 124% increased risk of cognitive decline (meta hazard ratio (HR) = 2.24, meta P = 8.16 × 10−16; Fig. 4f and Supplementary Table 6).
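A fixed effects meta-analysis of per-cohort HRs is commonly computed by inverse-variance weighting of the log HRs. The sketch below shows that generic approach along with the conversion from an HR to a percentage change in risk. The per-cohort estimates are hypothetical, the standard errors are back-calculated from 95% confidence intervals as an assumption, and the study's exact procedure is given in the Methods.

```python
import math


def fixed_effect_meta(estimates):
    """Pool (HR, lower 95% CI, upper 95% CI) tuples by inverse-variance weighting.

    Returns the pooled HR with its 95% confidence interval.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, lower, upper in estimates:
        log_hr = math.log(hr)
        # Standard error of the log HR recovered from the 95% CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_hr
        weight_total += weight
    pooled_log_hr = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log_hr),
        math.exp(pooled_log_hr - 1.96 * pooled_se),
        math.exp(pooled_log_hr + 1.96 * pooled_se),
    )


def percent_change_in_risk(hr):
    """Convert an HR to a percentage change in risk relative to the reference."""
    return (hr - 1.0) * 100.0


# Hypothetical per-cohort HRs with 95% CIs, for illustration only.
cohort_estimates = [(2.0, 1.5, 2.7), (2.5, 1.8, 3.5), (2.3, 1.2, 4.4)]
pooled_hr, pooled_lower, pooled_upper = fixed_effect_meta(cohort_estimates)

# An HR of 2.24 per s.d. corresponds to a 124% increased risk, as in the text.
reported_increase = percent_change_in_risk(2.24)
```

The same conversion gives a negative value for protective associations; for example, an HR of 0.27 corresponds to a 73% reduced risk.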

We then tested whether YWHAG:NPTX2 could predict dementia onset in A+T1+ cognitively normal individuals, adjusting for pTau181:Aβ42, CSF NfL, CSF Ng, age, sex and APOE4 dose (the Stanford cohort was not included because of a low event sample size, total n = 171). YWHAG:NPTX2 significantly predicted dementia onset across all cohorts; in a meta-analysis, an s.d. increase in YWHAG:NPTX2 conferred a 197% increased risk of conversion from cognitively normal to dementia (meta HR = 2.97, meta P = 7.03 × 10−4; Fig. 4g and Supplementary Table 7).

Aggregating data from all A+T1+ cognitively normal, MCI and mild dementia individuals across cohorts (total n = 697), YWHAG:NPTX2 was the strongest predictor of future cognitive decline among covariates (HR = 2.26, P = 2.24 × 10−22; Fig. 4h). Notably, YWHAG:NPTX2 was more strongly associated with future cognitive decline than the established biomarkers pTau181:Aβ42, CSF NfL and CSF Ng in the multivariate model.

To aid patient stratification, we binned individuals into binary high and low groups (upper and lower 25th percentiles) and tested AT1 status and YWHAG:NPTX2 status in predicting future cognitive decline, individually and together. As done previously, we aggregated data from all A+T1+ cognitively normal, MCI and mild dementia across cohorts. Based on AT1 status alone, we found that A+T1+ individuals had a four-time increased risk of future cognitive decline compared to A–T1-negative (A–T1−) individuals (HR = 3.96, P = 5.94 × 10−16; Fig. 4i). Surprisingly, based on YWHAG:NPTX2 status alone, YWHAG:NPTX2high individuals had a striking 15-time increased risk of future cognitive decline compared to YWHAG:NPTX2low individuals (HR = 15.36, P = 8.04 × 10−48; Fig. 4j). Combining both biomarkers, A+T1+ and YWHAG:NPTX2high individuals had a 19-time increased risk of future cognitive decline compared to A–T1− YWHAG:NPTX2low individuals (HR = 18.87, P = 3.74 × 10−25; Fig. 4k). No additional covariates were included in these Cox models, demonstrating the power of these biomarkers alone in predicting future cognitive decline versus maintenance.
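The high and low groups used above are defined by the top and bottom quartiles of the biomarker distribution. A minimal sketch of such percentile-based grouping follows; the function name, the example values and the choice of the standard library's default quantile method are assumptions for illustration.

```python
import statistics


def quartile_groups(values):
    """Label each value 'low', 'middle' or 'high' using the sample quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    labels = []
    for value in values:
        if value <= q1:
            labels.append("low")      # bottom 25th percentile
        elif value >= q3:
            labels.append("high")     # top 25th percentile
        else:
            labels.append("middle")   # excluded from high-versus-low contrasts
    return labels


# Hypothetical biomarker values, for illustration only.
example_ratio = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
example_groups = quartile_groups(example_ratio)
```

Because such cut points depend on the cohort being analyzed, percentile-based groups do not transfer directly between studies.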

Together, these results demonstrate that YWHAG:NPTX2 provides additional prognostic clinical utility beyond gold standard AD biomarkers.

Five defined CSF YWHAG:NPTX2 groups for cognitive prognosis

Given that percentiles may vary depending on the cohorts analyzed, we aimed to establish YWHAG:NPTX2 thresholds for stratifying individuals into defined low and high groups, which can be validated in future studies and potentially applied in clinical trials. While A/T/N status is typically binary (positive versus negative), we hypothesized that YWHAG:NPTX2, which correlates with increasing CI stages (five groups), could define multiple groups with varying risk of cognitive decline.

We used the Youden index from binary logistic regression models (MCI versus normal, mild dementia versus MCI, moderate versus mild dementia; Fig. 1g) to define four YWHAG:NPTX2 groups tracking CI stages (0, no CI; 1, MCI; 2, mild dementia; 3, moderate-to-severe dementia; Fig. 5a). Moderate and severe dementia were combined because of the limited sample size. Additionally, we defined a ‘−1’ group representing predominantly cognitively normal individuals (95% sensitivity for MCI versus normal).
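The Youden index is J = sensitivity + specificity − 1, and the cutoff that maximizes J is taken as the threshold between two adjacent CI stages. The sketch below searches candidate cutoffs directly on the biomarker, which is an assumption made for simplicity: with a single-predictor logistic model, a cutoff on the predicted probability corresponds to a cutoff on the biomarker itself. The data and function name are hypothetical.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    labels: 1 for the more impaired stage, 0 for the less impaired stage.
    A sample is called positive when its score is >= the cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical biomarker values: five cognitively normal (0), four MCI (1).
example_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9]
example_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]
cutoff, j_statistic = youden_cutoff(example_scores, example_labels)
```

Repeating this for each pair of adjacent stages yields one cutoff per boundary, which together partition the biomarker range into ordered groups.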

Fig. 5: Defined YWHAG:NPTX2 groups predict future cognitive resilience versus decline.

a, YWHAG.1:NPTX2 subgroups were defined using logistic regression to predict CI stage in individuals in adjacent CI stages. The Youden index was used to define cutoffs. Moderate and severe dementia were combined because of the limited sample size. An additional −1 group was defined based on 95% sensitivity in the model classifying MCI versus cognitively normal. The colors of the distributions indicate the CI stage as shown in Fig. 1e. b, Kaplan–Meier curves with 95% confidence intervals showing the probabilities of cognitive maintenance (no change in CI stage) among MCI A+T1+ individuals (Knight-ADRC and ADNI cohorts) stratified according to the YWHAG.1:NPTX2 groups. HRs and 95% confidence intervals from a cross-cohort, fixed effects meta-analysis comparing groups to the ‘correctly’ classified group, adjusted for pTau181:Aβ42, CSF NfL, CSF Ng, age, APOE4 and sex are shown (total n = 397). c, As in b but in cognitively normal A+T1+ individuals (total n = 168). CI stage was not included as a covariate because all individuals were cognitively normal.

Interestingly, we observed that many individuals were classified into YWHAG:NPTX2 groups that did not match their diagnosis (Fig. 5a). We hypothesized these ‘misclassifications’ may provide prognostic insights into future cognitive maintenance or decline. We focused on MCI and cognitively normal A+T1+ individuals, which are key populations for AD dementia prevention clinical trials.

First, we stratified MCI A+T1+ individuals in the Knight-ADRC and ADNI cohorts (n = 397) into these YWHAG:NPTX2 groups and determined their rates of future cognitive decline (defined as an increase in CI stage) over 15 years using a Cox proportional hazards regression meta-analysis. Relative to ‘correctly’ classified individuals with MCI (group 1), those classified into the ‘cognitively normal’ group (group 0) had a 73% reduced risk of cognitive decline (meta HR = 0.27, meta P = 7.35 × 10−6), while those classified into the ‘mild dementia’ group (group 2) had a 2.3-fold increased risk of cognitive decline (meta HR = 2.30, meta P = 2.74 × 10−6), adjusted for pTau181:Aβ42, CSF NfL, CSF Ng, age, sex and APOE4 (Fig. 5b, Extended Data Fig. 5a and Supplementary Table 8). Specifically, only 17 of 77 (22%) individuals in group 0 experienced cognitive decline, while 93 of 133 (70%) in group 2 declined over 15 years (Fig. 5b and Supplementary Table 8).
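
The cross-cohort pooling can be illustrated with a generic inverse-variance, fixed-effects meta-analysis of log hazard ratios. This is a sketch of the standard technique rather than the authors' code: the per-cohort HRs and confidence intervals are made-up placeholders, the standard errors are back-calculated from the 95% intervals under a normal approximation, and the function names are hypothetical.

```python
import math


def fixed_effects_meta(hrs_with_ci):
    """Pool hazard ratios with inverse-variance weights on the log scale.

    hrs_with_ci: list of (hr, ci_low, ci_high) tuples, one per cohort.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high, two_sided_p).
    """
    z95 = 1.959963984540054  # 97.5th percentile of the standard normal
    num, den = 0.0, 0.0
    for hr, lo, hi in hrs_with_ci:
        se = (math.log(hi) - math.log(lo)) / (2 * z95)  # SE of log(HR) from the CI width
        w = 1.0 / se ** 2
        num += w * math.log(hr)
        den += w
    pooled_log = num / den
    pooled_se = math.sqrt(1.0 / den)
    z = pooled_log / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P value, normal approximation
    return (math.exp(pooled_log),
            math.exp(pooled_log - z95 * pooled_se),
            math.exp(pooled_log + z95 * pooled_se),
            p)


def percent_risk_change(hr):
    """Express a hazard ratio as a percent change in risk relative to the reference group."""
    return (hr - 1.0) * 100.0


# Hypothetical per-cohort estimates (not taken from the study)
cohorts = [(0.30, 0.15, 0.60), (0.25, 0.12, 0.52)]
pooled_hr, ci_lo, ci_hi, p = fixed_effects_meta(cohorts)
```

With this convention, `percent_risk_change(0.27)` gives about −73, which is how an HR of 0.27 reads as a 73% reduced risk.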

Next, among cognitively normal A+T1+ individuals (n = 168), those in the ‘MCI’ group (group 1) had a 2.7-fold increased risk of cognitive decline (meta HR = 2.72, meta P = 7.16 × 10−4), and those in the ‘mild dementia’ group (group 2) had a 4.9-fold increased risk of cognitive decline (meta HR = 4.92, meta P = 2.39 × 10−3) compared to ‘correctly’ classified cognitively normal individuals (group 0), adjusted for pTau181:Aβ42, CSF NfL, CSF Ng, age, sex and APOE4 (Fig. 5c, Extended Data Fig. 5b and Supplementary Table 9). Notably, all cognitively normal individuals or individuals with MCI in the high-confidence cognitively normal group (group −1) maintained cognition over 10 years, while almost all in the ‘moderate-to-severe dementia’ group (group 3) declined within 1 year, although sample sizes were too small for rigorous statistics (Fig. 5b,c).

Overall, we demonstrate that individuals with the same clinical diagnosis and AT1 positivity can be further stratified into defined YWHAG:NPTX2 groups, which are strongly associated with future cognitive outcomes. Thresholds defining these groups are provided in Methods for future validation and exploration in clinical trials.

Partial plasma proteomic surrogate of CSF YWHAG:NPTX2

While CSF biomarkers provide important insights for AD research and in the clinic, the invasiveness of CSF extraction limits widespread clinical use. Thus, we sought to derive a plasma proteomics-based biomarker of CI that could recapitulate CSF YWHAG:NPTX2. We performed SomaScan plasma proteomics on 4,245 samples from the Knight-ADRC, Stanford and ROSMAP cohorts (Supplementary Table 10); 3,899 samples had complete CI diagnosis and 519 samples from the Knight-ADRC and Stanford cohorts were collected within 6 months of CSF samples from the same individuals, enabling direct plasma–CSF comparisons.

We first tested the correlations of plasma YWHAG:NPTX2 with CI and with CSF YWHAG:NPTX2 and found no significant correlations (Supplementary Fig. 3a). We then systematically tested several frameworks to optimize the correlations of the plasma signature with CI and with CSF YWHAG:NPTX2 (Supplementary Methods and Supplementary Figs. 3–5). Briefly, the optimal framework used a penalized linear model trained on plasma protein levels to predict CI based on a subset of plasma proteins that were (1) enriched for synapse proteins associated with CI in CSF, (2) unaffected by cohort and (3) not subject to putative APOE genotype-based proteoform-aptamer binding alterations. We trained the plasma signature on unique patient samples from the Knight-ADRC (n = 1,969) and ROSMAP (n = 860) cohorts and tested it on unique patient samples from the Stanford cohort (n = 600). We also tested the signature in the GNPC (F. B. Imam et al., manuscript in preparation), which encompasses over 40,000 patient samples from over 20 international research groups. Specifically, we tested the signature on a subset of 2,872 unique patient samples with complete CI information (global CDR) and without a diagnosis of non-AD neurodegeneration (five independent cohorts; Knight-ADRC, Stanford and ROSMAP not included).

The plasma signature (Supplementary Table 11) correlated with CI across all training and test cohorts (Knight-ADRC r = 0.66; ROSMAP r = 0.62; Stanford r = 0.51; GNPC-L r = 0.54; GNPC-E r = 0.47; GNPC-P r = 0.23; GNPC-I r = 0.68; GNPC-N r = 0.54; total n = 6,301; Fig. 6a and Supplementary Table 12). Proteins with strong weights included CPLX2, PTPRD, PI3, MAG and PTGDS (increased with CI), and NPTXR, SEZ6L, CD93, TPPP3 and PIANP (decreased with CI) (Supplementary Table 11). Notably, CPLX2, PTPRD, NPTXR (the receptor for NPTX2) and SEZ6L are synaptic proteins, confirming synapse protein associations with CI across both CSF and plasma (Fig. 6b). The plasma signature correlated with CSF YWHAG:NPTX2 (Knight-ADRC r = 0.57; Stanford r = 0.53; Fig. 6c), with stronger correlations observed in individuals with some degree of CI (CI ≥ MCI r = 0.64; CI = none r = 0.28; Fig. 6c).

Fig. 6: The plasma proteomic signature of CI partly recapitulates the CSF YWHAG:NPTX2 ratio, predicting AD onset and progression.

a, Box plot showing plasma signature versus CI severity across cohorts (total n = 6,301). The boxes represent the Q1, median and Q3; the whiskers show Q1 − 1.5× the IQR and Q3 + 1.5× the IQR. Pearson correlations are shown. b, Protein coefficients for the plasma signature. c, Correlations between the plasma signature and CSF YWHAG.1:NPTX2 in the Knight-ADRC and Stanford cohorts (n = 518). The colors indicate the CI stage as shown in a. Linear regression lines with 95% confidence intervals for cognitively normal versus cognitively impaired individuals are shown. d, Plasma signature versus neurofibrillary tau tangle load in the ROSMAP cohort, color-coded according to CI as in a. e, r² values from linear models regressing CI against covariates displayed on the x axis in the ROSMAP cohort (n = 110). The bars and error bars represent bootstrapped (n = 1,000) means and 95% confidence intervals. Two-sided P values were calculated via the empirical distribution of the bootstrapped test statistic. The difference in r values between the two models is shown. ***P < 0.001. f, Results from a multivariate linear model regressing CI against the displayed covariates in the ROSMAP cohort (n = 110). The points and error bars represent the standardized effect size and 95% confidence interval. g, Cox proportional hazards regression was used to associate the plasma signature with future cognitive decline in individuals with MCI to mild dementia, while adjusting for APOE4, age, sex and CI stage. The results from a cross-cohort, fixed effects meta-analysis are shown (total n = 1,877). The points and error bars represent the HRs and 95% confidence intervals. h, As in g but for predicting dementia onset in cognitively normal individuals (total n = 4,753). CI stage was not included as a covariate because all individuals were cognitively normal. i, Cox proportional hazards regression was used to associate the plasma signature with future cognitive decline in all individuals across the Knight-ADRC, ROSMAP and Stanford cohorts, while adjusting for APOE4, age, sex and CI stage (n = 2,292). The points and error bars represent the HRs and 95% confidence intervals for each covariate. j, Kaplan–Meier curves with 95% confidence intervals showing the rates of future cognitive decline in plasma signature-high (top 25th percentile) versus plasma signature-low (bottom 25th percentile) individuals. HR and 95% confidence interval are shown.

To assess whether the plasma signature, like CSF YWHAG:NPTX2, explained CI beyond Aβ and tau in AD, we used the ROSMAP cohort, which has collected comprehensive neuropathological and cognitive data from participants. We analyzed 110 individuals whose blood draws were within 2 years of death and whose autopsies confirmed a neuropathological diagnosis of AD (Consortium to Establish a Registry for Alzheimer’s Disease neuritic plaque score = probable or definite; Braak stage ≥ III). Plotting the plasma signature versus neurofibrillary tau tangle load, color-coded according to CI severity, we observed that high plasma signature levels were correlated with CI beyond tau levels (Fig. 6d). Linear regression showed that the plasma signature explained 36% of the variance in CI beyond neuritic Aβ plaque and tau tangle load (Fig. 6e). The association was robust to additional adjustment for age, sex, APOE4 dose and postmortem interval (Fig. 6f).
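
One way to quantify 'variance explained beyond' existing covariates is the gain in r² between two nested linear models, with a bootstrap over individuals for its uncertainty. The sketch below assumes ordinary least squares solved through the normal equations and uses synthetic stand-in data; the exact model specification used in the study is not restated here, and all names are hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(columns, y):
    """r^2 of an ordinary least-squares fit of y on the covariate columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def delta_r2(base_cols, extra_col, y):
    """Gain in r^2 from adding extra_col to a model that already contains base_cols."""
    return r_squared(base_cols + [extra_col], y) - r_squared(base_cols, y)


def bootstrap_delta_r2(base_cols, extra_col, y, n_boot=200, seed=0):
    """Bootstrap mean and percentile 95% interval of the r^2 gain (resampling individuals)."""
    rng = random.Random(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(delta_r2([[c[i] for i in idx] for c in base_cols],
                              [extra_col[i] for i in idx],
                              [y[i] for i in idx]))
    stats.sort()
    return sum(stats) / n_boot, stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Synthetic stand-ins for plaque load, tangle load, plasma signature and CI severity
rng = random.Random(1)
plaque = [rng.gauss(0, 1) for _ in range(110)]
tangle = [rng.gauss(0, 1) for _ in range(110)]
signature = [rng.gauss(0, 1) for _ in range(110)]
ci = [0.3 * a + 0.4 * t + 0.8 * s + rng.gauss(0, 0.5)
      for a, t, s in zip(plaque, tangle, signature)]
gain = delta_r2([plaque, tangle], signature, ci)
mean_gain, lo, hi = bootstrap_delta_r2([plaque, tangle], signature, ci)
```

Here `gain` plays the role of the additional variance attributed to the signature once plaque and tangle load are already in the model; the default of 200 resamples is chosen only to keep the example quick.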

Next, we examined whether the plasma signature could be used to predict future cognitive decline, as with CSF YWHAG:NPTX2. In addition to the ROSMAP, Knight-ADRC and Stanford cohorts, we analyzed the ARIC study, a large independent cohort with SomaScan plasma proteomics and MCI and dementia diagnosis follow-up42. For each cohort (total n = 1,877), we used a Cox proportional hazards regression model to test the association between the plasma signature and a future increase in CI stage over 1–15 years among individuals with MCI to mild dementia, adjusting for baseline CI, age, sex and APOE4 dose. We did not have sufficient data to include plasma or CSF biomarkers of AD pathology. The plasma signature significantly predicted future cognitive decline across all cohorts; in a meta-analysis, a 1-s.d. increase in the plasma signature conferred a 56% increased risk of cognitive decline (meta HR = 1.56, meta P = 3.16 × 10−19; Fig. 6g and Supplementary Table 13).

We then examined conversion from cognitively normal to dementia (total n = 4,753) and found that a 1-s.d. increase in the plasma signature conferred an 86% increased risk, adjusting for age, sex and APOE4 dose (meta HR = 1.86, meta P = 9.97 × 10−17; Fig. 6h and Supplementary Table 14). Aggregating data across the ROSMAP, Knight-ADRC and Stanford cohorts (total n = 2,292), the plasma signature was among the strongest predictors among covariates (HR = 1.59, P = 1.00 × 10−24; Fig. 6i), with age, baseline CI and APOE4 dose also having significant effects. Binary high and low groups based on the upper and lower 25th percentiles of the aggregated sample, as done with YWHAG:NPTX2, revealed that plasma signature-high individuals had a sevenfold increased risk of future cognitive decline compared to plasma signature-low individuals (HR = 7.17, P = 2.12 × 10−64; Fig. 6j), with no additional covariate adjustment.
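
The high-versus-low comparison can be sketched as a rank-based quartile split followed by a Kaplan–Meier estimate of the probability of remaining free of decline in each group. The code below is illustrative only: the data are made up, the function names are hypothetical, and it omits the confidence intervals and the Cox model used to obtain the HR.

```python
def quartile_groups(scores):
    """Label each score 'high' (top 25%), 'low' (bottom 25%) or None (middle 50%) by rank."""
    labels = [None] * len(scores)
    k = len(scores) // 4
    if k == 0:
        return labels  # too few samples to form quartile groups
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    for i in order[:k]:
        labels[i] = "low"
    for i in order[-k:]:
        labels[i] = "high"
    return labels


def kaplan_meier(times, events):
    """Kaplan-Meier estimates of the probability of remaining event-free.

    times: follow-up time per individual (for example, years)
    events: 1 if decline was observed at that time, 0 if censored
    Returns a list of (time, probability) pairs, one per distinct event time.
    An empty list means no events were observed (probability stays at 1).
    """
    surv = 1.0
    curve = []
    for t in sorted(set(t for t, e in zip(times, events) if e == 1)):
        at_risk = sum(1 for ti in times if ti >= t)
        declined = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1.0 - declined / at_risk
        curve.append((t, surv))
    return curve


# Toy, made-up data: signature score, years of follow-up, decline observed (1) or censored (0)
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
times = [10, 12, 8, 9, 5, 6, 2, 3]
events = [0, 0, 1, 0, 1, 0, 1, 1]
groups = quartile_groups(scores)
high = [i for i, g in enumerate(groups) if g == "high"]
low = [i for i, g in enumerate(groups) if g == "low"]
km_high = kaplan_meier([times[i] for i in high], [events[i] for i in high])
km_low = kaplan_meier([times[i] for i in low], [events[i] for i in low])
```

Because the split discards the middle half of the sample and the comparison is unadjusted, an HR between such extreme groups is expected to be larger than a per-s.d. HR from an adjusted model.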

Together, these data show that plasma proteomics combined with machine learning can be used to derive a plasma-based protein signature that correlates with AD dementia independently of Aβ and tau and partly recapitulates CSF YWHAG:NPTX2.

Discussion

Overall, our findings reveal that synapse proteins in the CSF and plasma are among the strongest Aβ-independent and tau-independent correlates of CI in AD, and that from these synapse proteins emerges the CSF YWHAG:NPTX2 ratio, a sparse and robust correlate of CI. We found that YWHAG:NPTX2 increases with cognitively normal aging starting early in life and predicts AD onset and progression in both sporadic AD and ADAD across six independent deeply phenotyped AD cohorts. Most notably, we established YWHAG:NPTX2 thresholds that define five groups which predict future cognitive resilience versus decline among early AD A+T1+ individuals of the same cognitive diagnosis.

Although these findings indicate that CSF YWHAG:NPTX2 represents a biological process in the brain that is central to cognitive function in AD, what that process is remains unclear. Our data suggest it reflects a ‘pathology’ distinct from existing neurodegeneration and synaptic AD biomarkers NfL, GAP-43 and Ng. Based on the literature, we speculate that it relates to synapse dysfunction and neuronal hyperactivity-induced synapse loss caused by reduced expression of NPTX2. In human brains, the NPTX2 mRNA and protein are downregulated in AD neurons based on single-cell RNA sequencing, immunohistochemistry and bulk proteomics28,29, suggesting that its decrease with CI in AD CSF may reflect decreased expression. Furthermore, genetic loss of NPTX2 and NPTXR causes major GluA4 loss27, increased network hyperactivity27 and increased complement-mediated microglial engulfment of synapses30. While NPTX2 has been studied exclusively in neurons, it is worth noting that the gene is also highly expressed in the oligodendrocyte lineage in humans43.

The role of YWHAG in the brain is less understood, but the YWHA family of proteins localize in neuron bodies and synapses44, and YWHAG mutations cause childhood epilepsy45, implying regulation of neuronal activity. Furthermore, YWHA proteins physically regulate tau aggregation and phosphorylation46,47, and YWHA protein concentrations are dramatically increased in Creutzfeldt–Jakob disease CSF48, suggesting potential involvement in activity-dependent prion and tau propagation. YWHAG also binds to phosphatidylserine44, which is involved in synaptic pruning49. Future studies that collect CSF near death and perform molecular measurements from matched postmortem brains combined with mechanistic studies may illuminate what ‘pathology’ CSF YWHAG:NPTX2 and other CSF biomarkers represent.

In addition to reported roles of YWHAG and NPTX2 in neuronal hyperactivity and synapse dysfunction, our study shows that CSF YWHAG:NPTX2 is associated with several aspects of AD including CI, normal aging, genetically driven Aβ overproduction (ADAD) and tau accumulation, which together strongly implicate its relevance to synapse dysfunction. First, as with YWHAG:NPTX2, synapse loss is the most robust histological correlate of CI, beyond Aβ and tau50. Second, synapse dysfunction, morphological alterations and loss, rather than overt neuron loss, are major hallmarks of mammalian brain aging that are closely linked with age-related cognitive decline in nonhuman primates51. Third, Aβ oligomers cause synapse loss and neuronal hyperactivity50, akin to how ADAD mutations, which presumably lead to Aβ overproduction, are associated with a faster increase in YWHAG:NPTX2 with age. Lastly, neuronal hyperactivity enhances tau propagation40, which aligns with the positive association between YWHAG:NPTX2 and future Aβ-driven tau PET.

Together, these data suggest that CSF YWHAG:NPTX2 is probably a measure of synapse dysfunction, perhaps related to hyperactivity driven by loss of NPTX2 and to YWHAG-driven tau toxicity, and point to synapse dysfunction as a promising therapeutic target to promote cognitive resilience in the presence of Aβ and tau. Strategies to restore NPTX2 expression to youthful levels may be especially promising, as overexpression of NPTX2 in tau P301S mice protects synapses from complement-mediated microglial engulfment30. NPTX2 is also reduced in frontotemporal dementia (FTD) and dementia with Lewy bodies CSF52, suggesting that its restoration may have benefits across multiple neurodegenerative diseases. Future studies are needed to determine whether CSF YWHAG:NPTX2 is correlated with CI independent of age and disease-specific pathologies across non-AD dementias, including FTD, dementia with Lewy bodies and amyotrophic lateral sclerosis.

Beyond biological and therapeutic implications, we show comprehensive evidence that CSF YWHAG:NPTX2 provides major prognostic utility in early AD well beyond established A/T/N biomarkers. We demonstrate it can be used to stratify both A+T1+ cognitively normal patients and A+T1+ patients with MCI into five low-risk and high-risk groups based on the thresholds provided in Methods. This stratification may aid in clinical trial target patient selection (that is, by selecting high-risk patients) and drug efficacy evaluation (that is, treatment response in different groups). The lack of cohort effects is especially advantageous for biomarker applications because no batch correction is needed, at least with the SomaScan assay. It is important to further validate our SomaScan YWHAG:NPTX2 group thresholds in independent cohorts, evaluate YWHAG:NPTX2 in comparison to tau PET in predicting future cognitive outcomes and examine changes in YWHAG:NPTX2 longitudinally in both the normal aging population and the population with early AD. Furthermore, developing affordable YWHAG:NPTX2 assays will be essential to overcome the high cost of the SomaScan assay and enable widespread clinical use.

Lastly, we show the development of a plasma proteomic signature of CI that partly recapitulates the characteristics of CSF YWHAG:NPTX2. Notably, the highest weighted proteins in the plasma signature are synapse proteins previously implicated as brain-specific proteins linked to brain aging53. Although more efforts are needed to improve the plasma signature, we expect that future advances in proteomics and machine learning will lead to sparse, scalable plasma surrogates of CSF YWHAG:NPTX2 to be used broadly for patient monitoring, clinical trials and research.

Methods

Participants

Stanford cohorts

Plasma and CSF collection, processing and storage for all Stanford cohorts followed a single standard procedure. All studies were approved by the Institutional Review Board (IRB) of Stanford University and written informed consent or assent was obtained from all participants or their legally authorized representatives.

Blood collection and processing followed a rigorous standardized protocol to minimize variation. Briefly, about 10 ml of whole blood was collected in four vacutainer EDTA tubes (Becton Dickinson) and spun at 1,800g for 10 min to separate out plasma, leaving 1 cm of plasma above the buffy coat to avoid contamination. Plasma was aliquoted into polypropylene tubes and stored at −80 °C. Processing took approximately 1 h from draw to freezing and storage. All draws occurred in the morning to minimize circadian effects.

CSF was collected via lumbar puncture using a 20–22 G spinal needle that was inserted in the L4–L5 or L5–S1 interspace. CSF samples were immediately centrifuged at 500g for 10 min, aliquoted in polypropylene tubes and stored at −80 °C.

Plasma from all Stanford cohorts was sent to SomaLogic for proteomics (v.4.1 SomaScan, ~7,000 proteins) in the same batch. CSF samples from all Stanford cohorts were sent to SomaLogic for proteomics (v.4.0 SomaScan, ~5,000 proteins) in the same batch. The core CSF AD biomarkers Aβ42, Aβ40 and pTau181 were measured using the fully automated LUMIPULSE G1200 instrument (Fujirebio) as described previously54,55. Descriptions for each cohort are provided below.

A total of 1,160 plasma samples (738 participants, longitudinal sampling) and 371 CSF samples (371 participants, one sample from each) from the Stanford cohorts were included in this study. Per-cohort sample sizes were as follows: ADRC plasma n = 827 (423 participants), CSF n = 113; SAMS plasma n = 222 (215 participants), CSF n = 169; Biomarkers in Parkinson’s Disease (BPD) plasma n = 55 (55 participants), CSF n = 68; Stanford Center for Memory Disorders (SCMD) cohort study plasma n = 45 (45 participants), CSF n = 21.

Stanford-ADRC study

Samples were acquired through the NIA-funded Stanford-ADRC, a longitudinal observational study of individuals with clinical dementia and age-matched and sex-matched individuals without dementia. Healthy controls were deemed cognitively unimpaired during a clinical consensus conference that included board-certified neurologists and neuropsychologists. Cognitively impaired individuals underwent CDR and standardized neurological and neuropsychological assessments, including the procedures of the National Alzheimer’s Coordinating Center (https://naccdata.org/), to determine cognitive and diagnostic status. Cognitive status was determined during a clinical consensus conference that included neurologists and neuropsychologists. All participants were free from acute infectious diseases and in good physical condition.

SAMS study

SAMS is an ongoing longitudinal study of healthy aging. Blood and CSF collection and processing, and neurological and neuropsychological assessments, were performed by the same team and followed the same protocol as in the Stanford-ADRC cohort. All SAMS participants had a CDR = 0 and a neuropsychological test score in the normal range; all SAMS participants were deemed cognitively unimpaired during a clinical consensus conference that included neurologists and neuropsychologists.

Stanford BPD cohort

The BPD cohort56 was a Michael J. Fox Foundation for Parkinson’s Research-funded longitudinal study of biomarkers associated with cognitive decline in Parkinson’s disease (PD). Research participants were recruited from the Stanford Movement Disorders Center between 2011 and 2015, with a PD diagnosis according to the UK Brain Bank criteria; they required bradykinesia with muscle rigidity or rest tremor. All participants completed baseline cognitive, motor, neuropsychological, imaging and biomarker assessments (plasma and optional CSF), including the Movement Disorders Society-revised Unified Parkinson’s Disease Rating Scale. Age-matched healthy controls were also recruited to control for age-associated biomarker changes. After a comprehensive neuropsychological battery, all participants were given a cognitive diagnosis of no CI, MCI or dementia, according to published criteria.

SCMD study

The SCMD was an NIA-funded cross-sectional study of people across the cognitive continuum. Participants with mild AD dementia and amnestic MCI were recruited from the Stanford Center for Memory Disorders between 2011 and 2015. Participants were included if they had a diagnosis of probable AD dementia (amnestic presentation) according to the NIA-Alzheimer’s Association57 criteria and a CDR score of 0.5 or 1, or a diagnosis of MCI according to the NIA-Alzheimer’s Association criteria57, a score of 1.5 s.d. below age-adjusted normative means on at least one test of episodic memory and a CDR score of less than 1. Older healthy controls were recruited from the community, were selected to have a similar average age as the enrolled patients and were required to have normal neuropsychological performance and a CDR of 0. Participants completed cognitive, neuropsychological, imaging and biomarker assessments with plasma.

Knight-ADRC study

The Knight-ADRC is an NIA-funded longitudinal observational study of individuals with clinical dementia and age-matched controls. Research participants undergo longitudinal cognitive, neuropsychological, imaging and biomarker assessments including a CDR. Cases with AD corresponded to those with a diagnosis of dementia of the Alzheimer’s type using criteria equivalent to the National Institute of Neurological and Communicative Disorders and Stroke (NINCDS)-Alzheimer’s Disease and Related Disorders Association (ADRDA) for probable AD; AD severity was determined using the CDR at the time of the lumbar puncture (for the CSF samples) or blood draw (for the plasma samples). Controls received the same assessment as cases but did not have dementia (CDR = 0).

Blood samples were collected in EDTA tubes (vacutainer purple top, Becton Dickinson) at the time of the visit, immediately centrifuged at 1,500g for 10 min, aliquoted on two-dimensional barcoded Micronic tubes (200 μl per aliquot) and stored at −80 °C. Plasma was stored in a monitored −80 °C freezer until it was pulled and sent to SomaLogic (SomaScan 7k) for data generation. Proteomics data from 2,112 plasma samples from each of the 2,122 participants were included in this study.

CSF samples were collected through lumbar puncture from participants after an overnight fast. Samples were processed and stored at −80 °C until they were sent for protein measurement. Proteomics data from 927 CSF samples from each of 927 participants were included in this study. CSF samples from the Knight-ADRC, ADNI and DIAN cohorts were sent for proteomics using the SomaScan platform (SomaScan 7k) in the same batch. CSF Aβ42, Aβ40 and pTau181 were measured using the LUMIPULSE G1200 immunoassay platform according to the manufacturer’s specifications.

The IRB of Washington University School of Medicine in St. Louis approved the study and research was performed in accordance with the approved protocols.

ADNI study

ADNI is a longitudinal multicenter study designed to develop early biomarkers of AD. All data used in this study were accessed from the ADNI database (https://adni.loni.usc.edu/). ADNI investigators assessed and diagnosed individuals as either cognitively normal (Mini Mental State Examination (MMSE) ≥ 24, CDR = 0, nondepressed), MCI (MMSE ≥ 24, CDR = 0.5, objective memory impairment on education-adjusted Wechsler Memory Scale-II, preserved activities of daily living) or with dementia (MMSE = 20–26, CDR > 0.5, NINCDS/ADRDA criteria for probable AD). Comprehensive details on study design, data acquisition, ethics and policies are available from the ADNI website listed above. CSF Aβ42 and pTau181 were measured using the xMAP immunoassay platform according to the manufacturer’s specifications. Proteomics data from 725 CSF samples from each of 725 participants were included in this study.

DIAN study

DIAN, led by Washington University School of Medicine in St. Louis, is a family-based long-term observational study designed to understand the earliest changes of ADAD. All participants undergo clinical and cognitive batteries (that is, global CDR). Comprehensive details on study design, data acquisition, ethics and policies can be found at https://dian.wustl.edu/. The data used in this study are from data freeze 15. CSF Aβ42 and pTau181 were measured using the LUMIPULSE G1200 platform according to the manufacturer’s specifications. Proteomics data from 455 CSF samples from each of 455 participants were included in this study.

BioFINDER2 study

BioFINDER2 is a Swedish prospective cohort study (NCT03174938) on age-related neurodegenerative diseases. AD was diagnosed based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition and Aβ positivity (CSF Aβ42:Aβ40). Proteomics data from a total of n = 829 participants, consisting of n = 480 cognitively unimpaired, n = 213 with MCI and n = 136 with AD dementia were included in this study. Global CDR scores were not measured in BioFINDER2, so for estimation, participants with AD were subdivided according to the MMSE thresholds defined in ref. 58: 21–25 for mild dementia (CDR = 1); 11–20 for moderate dementia (CDR = 2); and 0–10 for severe dementia (CDR = 3). Participants were recruited at Skåne University Hospital and Ängelholm Hospital. The study was approved by the Regional Ethical Committee in Lund, Sweden; all participants gave written informed consent.
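
The MMSE-based staging described above amounts to a simple lookup. The sketch below encodes only the bands stated in the text for participants with AD dementia; the function name is hypothetical, and scores outside 0–25 return `None` because the text does not say how they were handled.

```python
def mmse_to_cdr(mmse):
    """Map an MMSE score to an estimated global CDR for participants with AD dementia.

    Bands follow the text: 21-25 -> CDR 1 (mild), 11-20 -> CDR 2 (moderate),
    0-10 -> CDR 3 (severe). Anything else returns None (not specified in the text).
    """
    if 21 <= mmse <= 25:
        return 1
    if 11 <= mmse <= 20:
        return 2
    if 0 <= mmse <= 10:
        return 3
    return None
```

For example, `mmse_to_cdr(23)` returns 1 and `mmse_to_cdr(15)` returns 2.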

CSF samples were collected close in time after baseline clinical examination and handled according to established preanalytical protocols, previously described in detail in ref. 59. All analyses were performed by technicians blinded to all clinical and imaging data. CSF pTau181, Aβ42 and Aβ40 were measured using Elecsys assays in accordance with the manufacturer’s instructions (Roche Diagnostics). CSF Aβ42/Aβ40 was used to define Aβ positivity according to a previously established cutoff of <0.08 (ref. 60). CSF samples from the BioFINDER2 cohort were analyzed with liquid chromatography–tandem MS (LC–MS/MS), previously described in detail in ref. 32.

Tau PET was performed using [18F]RO948. SUVR images were created for the 70–90-min postinjection interval using the inferior cerebellar cortex as the reference region. A composite corresponding to a Braak I–IV meta-region of interest was used to represent AD-related tau tangle pathology.

Kuopio University Hospital

The Kuopio Normal Pressure Hydrocephalus (NPH) and AD Registry and Tissue Bank includes patients from the Eastern Finnish population who were referred to the Kuopio University Hospital neurosurgical unit for suspected NPH. The registry’s inclusive criteria encompass a wide range of hydrocephalic conditions and comorbidities: patients must exhibit 1–3 symptoms potentially associated with NPH (such as impaired gait, cognition or urinary continence) along with enlarged brain ventricles (Evans index > 0.3) as seen on computed tomography or magnetic resonance imaging, and no other clear cause that alone explains the observed findings and symptoms. Preoperative comorbidities and conditions were recorded at baseline; patients underwent a systematic differential diagnostic workup followed by a CSF tap test paired with gait evaluation.

A diagnostic right frontal cortical brain biopsy was taken from all patients during shunt surgery from the insertion site of the intraventricular catheter. A neuropathologist (T.R.) analyzed immunoreactivity for Aβ and hyperphosphorylated tau (HPτ) with light microscopy; the results were graded as present or absent. Patients were then further classified into groups according to the presence of Aβ or HPτ pathology observed in the frontal cortical biopsies61. Follow-up was conducted on all operated patients, with optimal shunt function ensured through valve adjustment, brain imaging, shunt valve tapping, lumbar infusion testing and shunt revision if necessary. Global CDR scores were not measured; the Consortium to Establish a Registry for Alzheimer’s Disease cognitive score62, including the MMSE, was used instead. A latest follow-up MMSE score of less than 26 was considered to indicate at least mild dementia61.

Lumbar CSF proteomics was performed using high-throughput tandem mass tag-labeling MS, previously described in detail in ref. 63. Data from 90 individuals with CSF proteomics and cognitive scoring performed within 1 year were included in this study.

The study was conducted according to the 1964 Declaration of Helsinki and its later amendments (2013) and all patients provided written informed consent. The Research Ethics Committee of the Northern Savo Hospital District (decision no. 276/13.02.00/2016) approved the study.

ROSMAP studies

All ROSMAP participants enrolled without known dementia and agreed to detailed clinical evaluation and brain donation at death64. Both studies were approved by the IRB of Rush University Medical Center (no. L91020181, MAP IRB no. L86121802). Both studies were conducted according to the principles expressed in the Declaration of Helsinki. Each participant signed an informed consent, Anatomical Gift Act and a Rush Alzheimer’s Disease Center (RADC) Repository consent (IRB no. L99032481) allowing their data and biospecimens to be repurposed. All MAP participants and a subset of ROS participants have blood drawn during an annual home visit. For plasma, blood is drawn in a lavender (purple) top EDTA tube. At out-of-town ROS sites, samples were spun, aliquoted into Nunc vials, stored on dry ice and sent to RADC by FedEx, where they were transferred to −80 °C. Samples collected in northeastern Illinois were brought to the RADC laboratory and processed there using the same procedures. A total of 1,046 55-µl samples were shipped to Stanford, then to SomaLogic for proteomics (SomaScan 7k); 973 samples passed quality control.

Clinical and neuropathological data collection has been reported in detail elsewhere10,65,66,67,68. Regarding clinical diagnosis, an actuarial decision tree designed to mimic expert clinical judgment was implemented by computer to inform several clinical diagnoses, including dementia and AD. It combined data reduction techniques for cognitive performance testing, with a series of discrete clinical judgments made in series by a neuropsychologist and a clinician. Presumptive diagnoses of dementia and AD were calculated that conformed to accepted clinical criteria. The clinician was asked to agree or disagree with the decisions. An algorithm used these decisions to provide diagnoses of MCI and amnestic MCI. Persons with MCI were judged to have CI by the neuropsychologist, and without a diagnosis of dementia by the clinician. Persons without dementia or MCI were categorized as having no CI.

Global CDR scores were not assessed in ROSMAP. Rather, participants underwent a battery of 21 cognitive performance tests, 17 of which were combined into a measure of global cognition, which was z-scored based on the mean and s.d. of all participants at baseline69. To estimate CI stages, participants were subdivided according to global cognition z-scores: cognitively normal = z-score > 0; MCI = −1 < z-score < 0; mild dementia = −2 < z-score ≤ −1; moderate dementia = −3 < z-score ≤ −2; and severe dementia = z-score ≤ −3. These cutoffs were set based on the distributions of global cognition z-scores per clinical diagnosis. Details on cognitive scores, neuropathology and other patient information are described at https://www.radc.rush.edu/documentation.htm. Proteomics data from 973 plasma samples from 890 participants were included in this study.
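The z-score cutoffs above can be written as a small lookup. The following Python sketch is illustrative only and is not the study code; the function name is hypothetical, and assigning a z-score of exactly 0 to MCI is an assumption, because the stated intervals leave that boundary unassigned.

```python
def ci_stage_from_z(z: float) -> str:
    """Map a global cognition z-score to an estimated CI stage."""
    if z > 0:
        return "cognitively normal"
    if z > -1:
        # Text gives -1 < z < 0 for MCI; placing z == 0 here is an assumption.
        return "MCI"
    if z > -2:
        return "mild dementia"      # -2 < z <= -1
    if z > -3:
        return "moderate dementia"  # -3 < z <= -2
    return "severe dementia"        # z <= -3


# Example: boundary values fall into the more impaired stage.
for z in (0.4, -0.5, -1.0, -2.0, -3.0):
    print(z, ci_stage_from_z(z))
```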

GNPC

The GNPC is a major neurodegenerative disease biomarker discovery effort, which hosts the largest collection of SomaScan data (over 40,000 patient samples from over 20 international research groups) from patient samples across healthy aging, AD, PD, amyotrophic lateral sclerosis and FTD. All cohorts and data are anonymized. Cohorts that had global CDR cognitive scores were included in this study.

ARIC study

ARIC is a prospective epidemiological study conducted in four US communities: Forsyth County, NC; Jackson, MS; the northwest suburbs of Minneapolis, MN; and Washington County, MD. The ARIC study enrolled 15,792 mostly White and Black participants aged 45–64 between 1987 and 1989 (ref. 70). After initial enrollment, participants had four additional in-person visits: visit 2 (1990–1992); visit 3 (1993–1995); visit 4 (1996–1999); and visit 5 (2011–2013). Participants were invited back for visit 6 (2016–2017) after 5 years, and visit 7 (2018–2019) immediately thereafter. Additional follow-up visits are ongoing. Blood was drawn for proteomic analysis at visits 2 and 5. For this paper, the late-life (8-year) dementia risk was assessed between visits 5 and 7. Plasma was collected using standardized protocols and frozen at −80 °C until analysis. Proteins were measured using the SomaScan v.4.0 assay; protein quality control steps have been described in detail previously42.

Dementia was adjudicated through cognitive assessment tests, telephone screening, informant ratings, hospital records and death record review, as described previously71. From visits 2 through 4, a three-instrument cognitive assessment was applied (delayed word recall task, digit symbol substitution from the Wechsler Adult Intelligence Scale-Revised and a letter fluency task). At visits 5 through 7, a surveillance approach was used, where participants received a comprehensive cognitive exam and a functional assessment that included the Clinical Dementia Rating Scale and Functional Activities Questionnaire. With these data, dementia was classified based on the NIA, Alzheimer’s Association and Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition criteria. In the time between visits 5 and 6, participants were contacted annually via phone and administered the Six-item Screener (SIS), a brief cognitive assessment. If participants received a low score on the SIS, or if they were unable to participate in the screening via phone, the Ascertain Dementia 8 (AD8) was administered to the participant’s informant. For participants who received a dementia diagnosis at visit 6 or 7, SIS, AD8, hospital discharge and death certificate codes were used to define the date of dementia onset. For participants who did not attend visits 6 or 7, SIS, AD8, hospital discharge and death certificate codes were used to define the dementia diagnosis and date of dementia onset.

The ARIC study protocols were approved by the IRBs at each participating center: University of North Carolina at Chapel Hill; Wake Forest University; Johns Hopkins University; University of Minnesota; and University of Mississippi Medical Center. All ARIC participants gave written informed consent at each study visit; proxies provided consent for participants who were judged to lack capacity.

Proteomics

The SomaLogic (https://somalogic.com/) SomaScan assay72,73, which uses slow off-rate modified DNA aptamers (SOMAmers) to bind target proteins with high specificity, was used to quantify the relative concentration of thousands of human proteins in plasma and CSF in the Stanford, Knight-ADRC, ADNI, DIAN and ROSMAP cohorts. The v.4.1 (~7,000 proteins) assay was used for all the mentioned cohorts and samples, except for the Stanford CSF, for which the v.4.0 (~5,000 proteins) assay was used. Standard SomaLogic normalization, calibration and quality control were performed on all samples, resulting in protein measurements in relative fluorescence units. Plasma samples were further normalized to a pooled reference using an adaptive maximum likelihood procedure. The resulting values are the data delivered by SomaLogic and are considered ‘raw’ data. We further log10-transformed these values because the assay measurements are expected to follow a log-normal distribution. No cohort batch corrections were applied. Acetylcholinesterase was removed before the analyses because it can be confounded with acetylcholinesterase inhibitor treatment. CSF samples from the BioFINDER2 cohort were analyzed with LC–MS/MS, previously described in detail in ref. 32. CSF samples from the Kuopio cohort were analyzed with high-throughput tandem mass tag-labeling MS, previously described in detail in ref. 63.

CI stage classification

CI stages reflect global CDR scores. CDR scores of 0, 0.5, 1, 2 and 3 are synonymous with the CI stages of none, MCI, mild dementia, moderate dementia and severe dementia, respectively. The Stanford, Knight-ADRC, ADNI and DIAN cohorts measured global CDR scores. BioFINDER2, ROSMAP and Kuopio did not measure global CDR scores, so we estimated global CDR scores based on cognitive battery tests and clinical diagnoses as described in the sections for each cohort.

A+T1+ versus A–T1− classification

Typically, ‘A’ positivity is defined by the levels of Aβ42 and ‘T1’ positivity by pTau181, using a separate Gaussian mixture model for each biomarker to derive value cutoffs74 (Supplementary Fig. 1a). This leads to four possible groups: A–T1−, A+T1−, A–T1+ and A+T1+. However, this classification system does not fit the ‘shape’ of the data and artificially increases the number of A–T1+ individuals75 (Supplementary Fig. 1a), as A–T+ individuals are extremely rare when classified using PET imaging biomarkers (the gold standard)75. To overcome this limitation, we used the CSF pTau181:Aβ42 ratio, which better fits the shape of the data (Supplementary Fig. 1b), to define A–T1− versus A+T1+ status (log10 pTau181:Aβ42 cutoff = −1, based on Lumipulse or xMAP; different cutoffs were used for Elecsys; Supplementary Fig. 1c). Previous studies showed that pTau181:Aβ42 appropriately captures A–T1− versus A+T1+ status15,16. Aβ positivity, regardless of T status, was determined using the CSF Aβ42:Aβ40 ratio or Aβ PET (gold standards).
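As an illustration of the ratio-based rule, the Python sketch below applies the log10 pTau181:Aβ42 cutoff of −1 stated above for Lumipulse or xMAP measurements. It is a minimal sketch rather than the study code: the function name and the example concentrations are hypothetical, both analytes are assumed to be in the same units, and treating a value exactly at the cutoff as A–T1− is an assumption.

```python
import math

# Cutoff from the text (Lumipulse or xMAP); Elecsys uses different cutoffs.
LOG10_RATIO_CUTOFF = -1.0


def at1_status(ptau181: float, abeta42: float,
               cutoff: float = LOG10_RATIO_CUTOFF) -> str:
    """Classify A+T1+ versus A-T1- from the CSF pTau181:Abeta42 ratio."""
    if ptau181 <= 0 or abeta42 <= 0:
        raise ValueError("concentrations must be positive")
    log_ratio = math.log10(ptau181 / abeta42)
    # Assumption: values exactly at the cutoff are treated as negative.
    return "A+T1+" if log_ratio > cutoff else "A-T1-"


# Hypothetical concentrations, same units for both analytes.
print(at1_status(80.0, 400.0))   # ratio 0.2, above the cutoff
print(at1_status(30.0, 1000.0))  # ratio 0.03, below the cutoff
```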

Statistical analyses

While some cohorts included multiple plasma samples from the same individual (precise numbers are described in the cohort sections), all analyses in this study were performed using proteomics data from only a single time point per individual. Only one CSF sample was collected per individual. For cross-sectional associations with CI, the most recent plasma sample was used to maximize the sample size of cases with dementia, which were fewer than cognitively normal cases. For analyses involving the prediction of future cognitive decline from a cognitively normal or early AD baseline, the earliest plasma sample was used to maximize sample size.
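The one-sample-per-individual rule can be sketched in Python as follows. This is an illustrative sketch, not the study code; the record layout (participant ID, draw date, sample ID) and the function name are hypothetical.

```python
from datetime import date


def pick_one_sample(samples, use="latest"):
    """Keep one sample per participant.

    samples: iterable of (participant_id, draw_date, sample_id) tuples.
    use: "latest" for cross-sectional analyses, "earliest" for
         analyses predicting future decline.
    """
    if use not in ("latest", "earliest"):
        raise ValueError("use must be 'latest' or 'earliest'")
    chosen = {}
    for pid, drawn, sid in samples:
        current = chosen.get(pid)
        if current is None:
            chosen[pid] = (drawn, sid)
        elif use == "latest" and drawn > current[0]:
            chosen[pid] = (drawn, sid)
        elif use == "earliest" and drawn < current[0]:
            chosen[pid] = (drawn, sid)
    return {pid: sid for pid, (drawn, sid) in chosen.items()}


records = [
    ("P1", date(2015, 3, 1), "S1"),
    ("P1", date(2018, 6, 9), "S2"),
    ("P2", date(2016, 1, 20), "S3"),
]
print(pick_one_sample(records, use="latest"))
print(pick_one_sample(records, use="earliest"))
```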

The NumPy76 and Pandas77 Python packages were used for data processing and transformation; the Matplotlib78 and seaborn79 Python packages were used for plotting.

Linear regression

The stats.pearsonr function from the SciPy80 Python package was used to assess the Pearson correlations. The ordinary least squares (OLS) function from the statsmodels81 Python package was used to assess the linear associations between protein levels and CI. For the unbiased proteome-wide association tests in Fig. 1b, we tested the following linear model for each protein: CI ~ protein + CSF pTau181:Aβ42 + age + sex + APOE4 dose + cohort + PC1. We included the PC1 of the proteome as a covariate because previous studies showed that it represents a large source of non-disease-related variance, potentially related to heterogeneity in CSF production and clearance rates35,75. Inclusion of PC1 ‘de-noised’ the data and greatly improved the significance of protein associations with CI in every cohort we assessed. Multiple hypothesis testing correction was applied using the Benjamini–Hochberg method, and the significance threshold was set at a 5% false discovery rate (q < 0.05). All other linear regression analyses in the manuscript relied on the same OLS function. Precise covariates used per analysis are displayed in the figures or described in the main text. The proportion of CI variance explained by certain variables was determined using the r2 values from the OLS models. Bootstrapping (n = 1,000) was applied to derive the 95% confidence intervals and P values for comparisons between variables (that is, YWHAG:NPTX2 versus pTau181:Aβ42).
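To make the multiple-testing step concrete, the following pure-Python sketch computes Benjamini–Hochberg q values from a list of per-protein P values. It is a generic illustration of the procedure, not the study code; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (q values)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that q values
    # never increase as P values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


p = [0.01, 0.04, 0.03, 0.005]  # hypothetical per-protein P values
q = benjamini_hochberg(p)
significant = [qi < 0.05 for qi in q]  # 5% false discovery rate
print(q, significant)
```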

Logistic regression

The LogisticRegression function from the scikit-learn82 Python package was used to assess the CI classification based on YWHAG:NPTX2 or pTau181:Aβ42 (Fig. 1g and Extended Data Fig. 1e). Models were tested to distinguish MCI versus cognitively normal, mild dementia versus MCI, moderate-to-severe versus mild dementia and mild dementia or worse versus cognitively normal. The AUC, accuracy, sensitivity and specificity were calculated using the confusion_matrix, accuracy_score, recall_score, roc_curve and roc_auc_score functions from scikit-learn82. Bootstrapping (n = 1,000) was applied to derive the 95% confidence intervals and P values for YWHAG:NPTX2 versus pTau181:Aβ42 comparisons.
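The bootstrap comparison of two classifiers can be sketched as below. This pure-Python example is an assumption-laden illustration, not the study code: it uses a percentile bootstrap (the exact bootstrap variant is not specified above), computes the AUC directly from pairwise comparisons rather than with scikit-learn, and all names and data are hypothetical.

```python
import random


def auc(scores, labels):
    """Probability that a random case outscores a random control
    (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(scores_a, scores_b, labels,
                             n_boot=1000, seed=0):
    """Percentile 95% CI for AUC(a) - AUC(b), resampling individuals."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # AUC is undefined without both classes
        a = auc([scores_a[i] for i in idx], y)
        b = auc([scores_b[i] for i in idx], y)
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


labels = [0, 0, 1, 1, 0, 1, 0, 1]
marker_a = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.3, 0.7]  # hypothetical scores
marker_b = [0.5, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.8]   # hypothetical scores
print(auc(marker_a, labels), auc(marker_b, labels))
print(bootstrap_auc_difference(marker_a, marker_b, labels))
```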

Cox proportional hazards regression

The CoxPHFitter function from the lifelines83 Python package was used to assess the associations between CSF YWHAG:NPTX2 and future cognitive decline (Figs. 4e–k, 5b,c and 6g–j). An event of cognitive decline was defined as a stage increase in CI (that is, none to MCI, or MCI to mild dementia). An event of conversion from cognitively normal to dementia was defined as a two-stage or more increase in CI from a cognitively normal baseline (none to mild dementia). Additional covariates such as baseline age, sex, APOE4 dose, CSF pTau181:Aβ42, CSF NfL, CSF Ng and CI were included depending on the analysis. The precise covariates used for each analysis are displayed in the figures or in the text. Meta-analyses to compare and aggregate effect sizes and confidence intervals from multiple cohorts were performed in R using the metafor84 package, with an inverse-variance-weighted fixed effects model.
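For readers unfamiliar with inverse-variance weighting, the fixed-effects pooling step can be sketched in a few lines. The study performed this step in R with metafor; the Python version below is only an illustration of the arithmetic, with hypothetical function names and inputs. For Cox models, the estimates would be per-cohort log hazard ratios with their standard errors.

```python
import math


def fixed_effects_meta(estimates, standard_errors):
    """Inverse-variance-weighted fixed-effects pooled estimate.

    Returns (pooled estimate, pooled s.e., 95% confidence interval).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)
    return pooled, pooled_se, ci


# Hypothetical log hazard ratios from two cohorts.
log_hr, se, ci = fixed_effects_meta([0.5, 0.3], [0.1, 0.2])
print(math.exp(log_hr), tuple(math.exp(x) for x in ci))  # hazard-ratio scale
```

The more precise cohort (smaller standard error) dominates the pooled estimate, which is the intended behavior of a fixed-effects model.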

Biological pathway enrichment analyses

gProfiler85 was used for Gene Ontology (GO) enrichment analyses (Fig. 1c). The GO86,87 database includes the SynGO19 database used to subset the synapse proteins; 6,379 unique protein-encoding genes detected by SomaScan were used as background in gProfiler.

Derivation of CSF YWHAG:NPTX2 ratio

The LassoCV function from the scikit-learn82 Python package was used to train, in the ADNI cohort, a penalized linear model to predict CI severity based on the levels of 214 synapse proteins that significantly changed with CI in the ADNI and Knight-ADRC cohorts (Fig. 1b). Fivefold cross-validation was implemented to identify the optimal lambda parameter. The RFECV and RFE functions from scikit-learn82 were used to perform recursive feature elimination (RFE) on the LassoCV model to further simplify the model to facilitate clinical applications. RFECV showed that two proteins sufficiently captured most of the signal in the model. RFE was used to derive a model with two proteins, which resulted in the normalized ratio between YWHAG and NPTX2. Details on testing the YWHAG:NPTX2 ratio in independent cohorts are provided in the ‘Code availability’ section of the article.
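The following sketch illustrates, on synthetic data, how recursive feature elimination on a cross-validated lasso can reduce a protein panel to two features. It is not the study code: the data are simulated, the sample and feature counts are arbitrary, and the outcome is constructed (as an assumption for illustration) to depend on two features with opposite signs.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n_samples, n_proteins = 300, 20

# Synthetic stand-in for log10 protein levels (hypothetical data).
X = rng.normal(size=(n_samples, n_proteins))
# Synthetic outcome driven by features 0 and 1 with opposite signs.
y = X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n_samples)

# Fivefold cross-validated lasso, pruned to two features by RFE.
selector = RFE(LassoCV(cv=5), n_features_to_select=2, step=1)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)  # indices of retained features
coefs = selector.estimator_.coef_             # weights of the two-feature model
print(selected, coefs)
```

Because the inputs are on a log scale, a two-feature model with weights of similar magnitude and opposite sign behaves like a log ratio, since log(a) − log(b) = log(a/b).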

Derivation of plasma signature of CI

Full details are shown in Supplementary Methods. Briefly, the LassoCV function from the scikit-learn82 Python package was used to train, in the Knight-ADRC and ROSMAP cohorts, a penalized linear model to predict CI severity based on the levels of 745 plasma proteins. Fivefold cross-validation was implemented to identify the optimal lambda parameter. We call this model the ‘plasma signature’ throughout the paper. Model weights and intercept are provided in Supplementary Table 7 and details on testing the signature in independent cohorts are provided in the ‘Code availability’ section of the article.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.